Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ironic Standalone Operator #461

Merged
merged 1 commit into from
Oct 21, 2024

Conversation

dtantsur
Copy link
Member

This design proposal discusses ironic-standalone-operator: provides a
motivation for its creation, describes its goals and outlines the
current design and features, as well as the plans for the nearest
future.

Signed-off-by: Dmitry Tantsur dtantsur@protonmail.com

@metal3-io-bot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@metal3-io-bot metal3-io-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 25, 2024
@metal3-io-bot metal3-io-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 25, 2024
@dtantsur dtantsur force-pushed the ironic-standalone-operator branch 4 times, most recently from 27ec8ca to 04f201c Compare August 1, 2024 16:06
@dtantsur dtantsur marked this pull request as ready for review August 1, 2024 16:07
@metal3-io-bot metal3-io-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2024
@dtantsur dtantsur force-pushed the ironic-standalone-operator branch 2 times, most recently from 23135d5 to 7be12d1 Compare August 1, 2024 16:16
- `dhcp` - another nested structure with DHCP parameters (see below)
- `externalIP` - IP through which nodes deployed over virtual media access
Ironic and HTTPD
- `interface`, `ipAddress`, `macAddresses` - various ways to specify the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, how does provisioningInterface, provisioningIPAddress and provisioningMACAddress sound?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a potential for confusion here. We call "provisioning network" a dedicated network for the PXE and other provisioning traffic. Ironic can work without it, and the variables work regardless of whether you have a "provisioning network".

Maybe I'm overthinking it? Would like more opinions here as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do decide to change this, I'd rather rename the whole "networking" sub-field into something like "provisioningNetwork" to avoid duplicate prefixes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the rest of the sub-fields seem unrelated to the provisioning network (like dhcp and external ip)?
How would the interface, ipAddress and macAddresses fields work if there is no provisioning interface?
(I am not too familiar with ironic's networking, so just trying to understand the expected behaviour)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DHCP is very-very much related to the provisioning network: it's DHCP on the provisioning network. ExternalIP is not directly related indeed, maybe we'd need to move it out then.

How would the interface, ipAddress and macAddresses fields work if there is no provisioning interface?

There must be a provisioning interface, but it can be on some existing network rather than an isolated provisioning network. Confusing, I know. This is why I'm pondering our usage of "provisioning" here.

Copy link
Member

@Rozzii Rozzii Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also say to not use "provisioning", on Bare Metal Life Cycle management level the name "Provisioning Network" makes sense but if we just talk about an ironic instance not that much. Everything Ironic does with the machine goes through the provisioning network, ofc he BMC might have it's own separate network but ipxe, dhcp, bootp, http image server, IPA - Ironic communication they all happen in the "provisioning" network but from Ironic perspective it is simply "the Network".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

avoid running it 3+ times in parallel. Maybe it will take a form of a *Job*.
- BMO will need to be updated to handle more cases of unexpected provision
state. E.g. what needs to be done when the BMH is *inspecting* but the node
is found in a completely wrong state like `active` or `clean wait`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an additional complexity to consider if we're going to support mariadb combined with persistent volumes, when the mariadb version changes you either have to ensure clean shutdown of the container or enable some kind of backup/restore workflow due to the mariadb checks around versioning for the redo log - see here for example.

This can likely be resolved with with pod lifecycle hooks, or perhaps some kind of job pre/post upgrade but it's additional complexity which I don't think we currently consider in the community deployment examples.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point, thank you!

Copy link
Member

@Rozzii Rozzii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

Copy link
Member

@Rozzii Rozzii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
From my perspective it is fine given that you will address the comments.

@metal3-io-bot metal3-io-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 25, 2024
@Rozzii
Copy link
Member

Rozzii commented Sep 25, 2024

/hold
Just holding to avoid someone putting approve on it while my lgtm is active and merging it

@metal3-io-bot metal3-io-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 25, 2024
Copy link
Member

@kashifest kashifest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @dtantsur few questions/nits inline

design/ironic-standalone-operator.md Outdated Show resolved Hide resolved
design/ironic-standalone-operator.md Outdated Show resolved Hide resolved
design/ironic-standalone-operator.md Outdated Show resolved Hide resolved
its private key certificate and sign the certificate with this CA. The CA will
be trusted **only** for the RPC purpose, removing the possibility of abuse.
To reduce the number of code paths, this process will happen unconditionally,
even when TLS for Ironic itself is not enabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my understanding what prevents us from enabling TLS unconditionally for ironic itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, nothing really. We'd need to decide if we want to use a CertificateSigningRequest (and require a manual approval) or just generate a self-signed certificate. Opinions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the end goal would be to have an integration with cert-manager as in most K8s environments cert-manager is handling the certificates.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only worry about the manual step (approving the cert request). Is it also considered standard?

I mean, in the situation when the admin has not provided any certificate via the secret.

design/ironic-standalone-operator.md Outdated Show resolved Hide resolved
@kashifest
Copy link
Member

I can approve it already and hold can be taken back when the reviews are addressed
/approve

@metal3-io-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kashifest, Rozzii

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@metal3-io-bot metal3-io-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 25, 2024
This design proposal discusses ironic-standalone-operator: provides a
motivation for its creation, describes its goals and outlines the
current design and features, as well as the plans for the nearest
future.

Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com>
@metal3-io-bot metal3-io-bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2024
@Rozzii
Copy link
Member

Rozzii commented Oct 15, 2024

/lgtm
@dtantsur feel free to remove the hold if you feel the time is right.

@metal3-io-bot metal3-io-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2024
Copy link
Member

@Rozzii Rozzii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@MahnoorAsghar
Copy link

LGTM

@dtantsur
Copy link
Member Author

@kashifest do you feel like your comments have been (at least mostly) addressed? The TLS one is probably the only outstanding (and we can discuss it later if you wish).

@kashifest
Copy link
Member

@kashifest do you feel like your comments have been (at least mostly) addressed? The TLS one is probably the only outstanding (and we can discuss it later if you wish).

I am fine with this as is. We can continue the TLS discussion as you suggested.
/hold cancel

@metal3-io-bot metal3-io-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 21, 2024
@metal3-io-bot metal3-io-bot merged commit c1b192d into metal3-io:main Oct 21, 2024
8 checks passed
@dtantsur dtantsur deleted the ironic-standalone-operator branch October 23, 2024 07:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants